The solution to the last exercise in the Numpy Basics notebook introduces an important concept when working with NumPy: the axis. The axis indicates the dimension along which a function operates, provided the function is a reduction, that is, one that takes multiple values and combines them into a single value.
Let's look at a concrete example with sum:
In [ ]:
    
# Convention for import to get shortened namespace
import numpy as np
    
In [ ]:
    
# Create an array for testing
a = np.arange(12).reshape(3, 4)
a
    
In [ ]:
    
# This calculates the total of all values in the array
np.sum(a)
    
In [ ]:
    
# Keep this in mind:
a.shape
    
In [ ]:
    
# Instead, sum along axis 0 (down the rows), giving one total per column:
np.sum(a, axis=0)
    
In [ ]:
    
# Or sum along axis 1 (across the columns), giving one total per row:
np.sum(a, axis=1)
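
To make the effect of axis easier to see, here is a small sketch that checks the shapes of the results. The names col_totals and row_totals are only illustrative; keepdims is a standard np.sum argument that keeps the reduced axis with length 1.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# axis=0 collapses the 3 rows: one total per column, shape (4,)
col_totals = np.sum(a, axis=0)

# axis=1 collapses the 4 columns: one total per row, shape (3,)
row_totals = np.sum(a, axis=1)

print(col_totals, col_totals.shape)
print(row_totals, row_totals.shape)

# keepdims=True keeps the reduced axis with length 1, which can help
# when the result needs to broadcast against the original array
row_totals_2d = np.sum(a, axis=1, keepdims=True)
print(row_totals_2d.shape)
```

A quick way to remember it: the axis you pass is the one that disappears from the shape of the result.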
    
In [ ]:
    
# Synthetic data
temp = np.random.randn(100, 50)
u = np.random.randn(100, 50)
v = np.random.randn(100, 50)
# Calculate the gradient components
gradx, grady = np.gradient(temp)
# Turn into an array of vectors:
# axis 0 is x position
# axis 1 is y position
# axis 2 is the vector components
grad_vec = np.dstack([gradx, grady])
print(grad_vec.shape)
# Turn wind components into vector
wind_vec = np.dstack([u, v])
# Calculate advection, the dot product of wind and the negative of gradient
# DON'T USE NUMPY.DOT (doesn't work). Multiply and add.
    
In [ ]:
    
# %load solutions/advection.py
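
As a hint for the "multiply and add" idea, here is a minimal sketch on made-up arrays (vec_a and vec_b are hypothetical and not part of the exercise data). It multiplies the components element by element and then sums along the last axis, which gives one dot product per grid point.

```python
import numpy as np

# Hypothetical 2 x 2 grid where each point holds a 2-component vector
vec_a = np.array([[[1., 2.], [3., 4.]],
                  [[5., 6.], [7., 8.]]])
vec_b = np.array([[[1., 0.], [0., 1.]],
                  [[1., 1.], [2., -1.]]])

# Multiply matching components, then add them up along the component axis
pointwise_dot = np.sum(vec_a * vec_b, axis=-1)

print(pointwise_dot.shape)
print(pointwise_dot)
```

The result has the shape of the grid, (2, 2), because the component axis is the one that was summed away.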
    
In [ ]:
    
# Create some synthetic data representing temperature and wind speed data
np.random.seed(19990503)  # Make sure we all have the same data
temp = (20 * np.cos(np.linspace(0, 2 * np.pi, 100)) +
        50 + 2 * np.random.randn(100))
spd = (np.abs(10 * np.sin(np.linspace(0, 2 * np.pi, 100)) +
              10 + 5 * np.random.randn(100)))
    
In [ ]:
    
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot(temp, 'tab:red')
plt.plot(spd, 'tab:blue');
    
By comparing a NumPy array with a value, we get a boolean array holding the result of the comparison between each element and the value.
In [ ]:
    
temp > 45
    
We can use the resulting array to index into the NumPy array and retrieve the values where the result was True.
In [ ]:
    
print(temp[temp > 45])
    
So long as the shape of the boolean array matches the data, the boolean array can come from anywhere.
In [ ]:
    
print(temp[spd > 10])
    
In [ ]:
    
# Make a copy so we don't modify the original data
temp2 = temp.copy()
# Replace all places where spd is < 10 with NaN (not a number) so matplotlib skips them
temp2[spd < 10] = np.nan
plt.plot(temp2, 'tab:red')
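
If you would rather not copy and then modify, np.where can build the masked array in one step. This is a sketch on a few made-up values (the short temp and spd arrays here are stand-ins for the data above); np.where(condition, x, y) picks x where the condition is True and y elsewhere.

```python
import numpy as np

# Small stand-in arrays, not the synthetic data from the notebook
temp = np.array([40., 50., 44.])
spd = np.array([12., 8., 15.])

# Where spd < 10 use NaN, otherwise keep the temperature
masked = np.where(spd < 10, np.nan, temp)

print(masked)
print(temp)  # the original array is left untouched
```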
    
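We can also combine multiple boolean arrays using the bitwise operators (`&` for and, `|` for or, `~` for not). Each comparison MUST be wrapped in parentheses because the bitwise operators have higher precedence than the comparison operators.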
In [ ]:
    
print(temp[(temp < 45) & (spd > 10)])
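
Here is a short sketch of the three bitwise operators on a few made-up values (these small temp and spd arrays are stand-ins for the data above). Without the parentheses, `&` binds more tightly than `<` and `>`, so Python would try to evaluate `45 & spd` first, which is not what we want.

```python
import numpy as np

# Small stand-in arrays, not the synthetic data from the notebook
temp = np.array([40., 50., 44., 60., 30.])
spd = np.array([12., 8., 15., 11., 5.])

both = (temp < 45) & (spd > 10)    # True where both conditions hold
either = (temp < 45) | (spd > 10)  # True where at least one holds
neither = ~either                  # invert the mask

print(temp[both])
print(temp[either])
print(temp[neither])
```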
    
In [ ]:
    
# Here's the "data"
np.random.seed(19990503)  # Make sure we all have the same data
temp = (20 * np.cos(np.linspace(0, 2 * np.pi, 100)) +
        80 + 2 * np.random.randn(100))
rh = (np.abs(20 * np.cos(np.linspace(0, 4 * np.pi, 100)) +
              50 + 5 * np.random.randn(100)))
# Create a mask for the two conditions described above
# good_heat_index = 
# Use this mask to grab the temperature and relative humidity values that together
# will give good heat index values
# temp[] ?
# BONUS POINTS: Plot only the data where heat index is defined by
# inverting the mask (using `~mask`) and setting invalid values to np.nan
    
In [ ]:
    
# %load solutions/heat_index.py
    
In [ ]:
    
print(temp[0])
    
We can also extract the first, fifth, and tenth elements:
In [ ]:
    
print(temp[[0, 4, 9]])
    
One place this comes into play is sorting NumPy arrays with argsort. This function returns the indices that would put the array's items in sorted order. So for our temp "data":
In [ ]:
    
inds = np.argsort(temp)
print(inds)
    
We can pass this array of indices to temp to get it in sorted order:
In [ ]:
    
print(temp[inds])
    
Or we can slice inds to only give the 10 highest temperatures:
In [ ]:
    
ten_highest = inds[-10:]
print(temp[ten_highest])
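
Because argsort returns indices rather than values, the same indices can reorder a second array that lines up with the first. The sketch below uses a few made-up values (stand-ins for temp and spd) and reverses the slice so the highest temperature comes first.

```python
import numpy as np

# Small stand-in arrays, not the synthetic data from the notebook
temp = np.array([55., 72., 61., 48., 90.])
spd = np.array([5., 12., 7., 3., 20.])

inds = np.argsort(temp)         # indices that sort temp from low to high
three_highest = inds[-3:][::-1]  # last three indices, highest first

print(temp[three_highest])
print(spd[three_highest])       # wind speeds at those same positions
```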
    
NumPy has other arg functions that return indices rather than values:
In [ ]:
    
np.*arg*?
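
As a quick sketch, here are three of them on a few made-up values (a stand-in for temp): np.argmax and np.argmin give the position of the largest and smallest element, and np.argwhere gives the positions where a condition is True.

```python
import numpy as np

# Small stand-in array, not the synthetic data from the notebook
temp = np.array([55., 72., 61., 48., 90.])

hottest = np.argmax(temp)   # index of the maximum value
coldest = np.argmin(temp)   # index of the minimum value

# argwhere returns one row per match; ravel flattens it for a 1-D array
above_60 = np.argwhere(temp > 60).ravel()

print(hottest, temp[hottest])
print(coldest, temp[coldest])
print(above_60)
```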